Leading Professional Society for Computational Biology and Bioinformatics
Connecting, Training, Empowering, Worldwide

Posters - Schedules

Posters at ISMB/ECCB 2021 will be presented virtually. Authors will pre-record a poster talk (5-7 minutes) and upload it, along with a PDF of their poster, to the virtual conference platform beginning July 19 and no later than July 23. All registered conference participants will have access to the posters and presentations during the conference and afterwards until October 31, 2021. Q&A is available through a chat function, and poster presenters can schedule small group discussions with up to 15 delegates during the conference.

Information on preparing your poster and poster talk is available at: https://www.iscb.org/ismbeccb2021-general/presenterinfo#posters

Ideally authors should be available for interactive chat during the times noted below:

Session A: Sunday, July 25 between 15:20 - 16:20 UTC
Session B: Monday, July 26 between 15:20 - 16:20 UTC
Session C: Tuesday, July 27 between 15:20 - 16:20 UTC
Session D: Wednesday, July 28 between 15:20 - 16:20 UTC
Session E: Thursday, July 29 between 15:20 - 16:20 UTC

A Graph Feature Auto-Encoder for the Prediction of Unobserved Node Features on Biological Network
COSI: Special Session 01
  • Tom Michoel, Computational Biology Unit, Department of Informatics, University of Bergen, Norway
  • Ramin Hasibi, Computational Biology Unit, Department of Informatics, University of Bergen, Norway

Short Abstract: In this work, we study the representation of transcriptional, protein-protein and genetic interaction networks in E. coli and mouse by integrating gene expression values with network structures using graph auto-encoders. Our results indicate that such representations explain a large proportion of the variation in gene expression data, and that using gene expression data as node features improves the reconstruction of the graph from the embedding. We further propose a new end-to-end Graph Feature Auto-Encoder framework for predicting node features from the structure of the gene networks. The framework is trained on the feature prediction task, and we show that it predicts unobserved gene expression values better than regular multilayer perceptrons. When applied to the problem of imputing missing data in single-cell RNA-seq data, the graph feature auto-encoder using our new graph convolution layer, FeatGraphConv, outperformed a state-of-the-art imputation method that does not use protein interaction information, showing the benefit of integrating biological networks and omics data with our proposed approach.

Antibody humanization using deep learning and natural antibody repertoires
COSI: Special Session 01
  • David Prihoda
  • Andrew Waight
  • Veronica Juan
  • Laurence Fayadat-Dilman
  • Daniel Svozil
  • Danny A. Bitton

Efficient Design of Optimized AAV Capsids using Multi-property Machine Learning Models Trained across Cells, Organs and Species
COSI: Special Session 01
  • Patrick McDonel
  • Jeff Jones
  • Jorma Gorns
  • Justin Yan
  • Kathy Lin
  • Lauren Wheelock
  • Megan Cramer
  • Michael Stiffler
  • Nishith Nagabhushana
  • Jeff Gerold
  • Roza Ogurlu
  • Sam Sinai
  • Sam Wolock
  • Shireen Abestesh
  • Stephen Malina
  • Stephen Northup
  • Sylvain Lapan
  • Eryney Marrogi
  • Adrian Veres
  • Alexander Brown
  • Amir Shanehsazzadeh
  • Anna Wec
  • Cem Sengel
  • Chris Reardon
  • Elina Locane
  • Eric Kelsic
  • Farhan Damani
  • Flaviu Vadan
  • Hanna Mendes
  • Heikki Turunen
  • Helene Kuchwara
  • Jakub Otwinowski
  • James Oswald
  • Jamie Kwasnieski

Short Abstract: While next-gen high-throughput assays enable us to learn how capsid sequence changes affect capsid functionality, measuring and optimizing capsid properties in the most therapeutically relevant models, such as non-human primates (NHP), remains challenging. The rate of transduction in target organs is lower than ideal, and most of the sequence space is non-functional. To overcome these challenges, we investigated to what extent multi-task machine learning can improve the efficiency of AAV capsid design for high-performing capsids. We applied our method to a previously designed library of 156,858 sequence variants derived from a natural AAV capsid serotype and measured their properties as delivery vectors. Multi-property models (MPMs) provide a coherent framework in which to connect information from experiments across cell lines, organs, and species to the most relevant outcomes in NHP studies, thereby reducing the high resource and ethical burdens of NHP experimentation. Additionally, MPMs help overcome data sparsity in traits that are hard to measure, thereby improving model accuracy and providing a more reliable interpretation of experimental results. With further refinement, MPMs will enable the design of highly optimized AAV capsids that open new frontiers in delivery, toward realizing the full potential of gene therapy.

Evaluation Of Convolutional Neural Networks Containing Interactions Between Genomic Motifs
COSI: Special Session 01
  • Bernhard Renard, Hasso Plattner Institute, Germany
  • Marta Lemanczyk, Hasso Plattner Institute, Germany

Evotuning protocols for Transformer-based variant effect prediction on multi-domain proteins
COSI: Special Session 01
  • Hideki Yamaguchi
  • Yutaka Saito

Short Abstract: Accurate prediction of variant effects has broad impacts on protein engineering. Recent machine learning approaches toward this end are based on representation learning, often using large-scale, diverse datasets. However, it is still unclear how we can effectively learn the intrinsic evolutionary properties of an engineering target protein, specifically when the protein is composed of multiple domains. Additionally, no optimal protocols are established for incorporating such properties into Transformer-based variant effect predictors. In response, we propose evolutionary fine-tuning, or “evotuning”, protocols, considering various combinations of homology search, fine-tuning, and sequence embedding strategies, without the need for multiple sequence alignment. Exhaustive evaluations on diverse proteins indicate that the models obtained by our protocols achieve significantly better performances than previous methods. The visualizations of attention maps suggest that the structural information can be incorporated by evotuning without direct supervision, possibly leading to better prediction accuracy.

Graph attention network based representation learning for cancer drug response prediction and interpretation
COSI: Special Session 01
  • Dionizije Fa, Ruđer Bošković Institute, Croatia
  • Frank Supek, Institute for Research in Biomedicine, Spain
  • Tomislav Šmuc, Ruđer Bošković Institute, Croatia

Short Abstract: We present a state-of-the-art multimodal deep learning model for cancer drug response prediction based on pharmacogenomic data. We featurize cell lines as protein-protein interaction graphs. Graph attention networks then allow us to identify biologically plausible interactions in these graphs by examining the attention coefficients.

Guided Generative Protein Design using Regularized Transformers
COSI: Special Session 01
  • Smita Krishnaswamy, Yale University, United States
  • Egbert Castro
  • Abhinav Godavarthi
  • Julian Rubinfien

HydrAMP: a deep generative model for antimicrobial peptide discovery
COSI: Special Session 01
  • Paulina Szymczak, Faculty of Mathematics, Informatics and Mechanics of the University of Warsaw, Poland
  • Marcin Możejko, Faculty of Mathematics, Informatics and Mechanics of the University of Warsaw, Poland
  • Tomasz Grzegorzek, Faculty of Mathematics, Informatics and Mechanics of the University of Warsaw, Poland
  • Marta Bauer, Medical University of Gdańsk, Poland
  • Wojciech Kamysz, Medical University of Gdańsk, Poland
  • Damian Neubauer, Medical University of Gdańsk, Poland
  • Michał Michalski, The Centre of New Technologies, University of Warsaw, Poland
  • Piotr Setny, The Centre of New Technologies, University of Warsaw, Poland
  • Jacek Sroka, Faculty of Mathematics, Informatics and Mechanics of the University of Warsaw, Poland
  • Ewa Szczurek, Faculty of Mathematics, Informatics and Mechanics of the University of Warsaw, Poland

Short Abstract: The development of resistance to conventional antibiotics in pathogenic bacteria poses a global health hazard. Antimicrobial peptides (AMPs) are an emerging group of compounds with the potential to become the new generation of antibiotics. Deep learning methods are widely used by wet-laboratory researchers to screen for the most promising candidates. We propose HydrAMP, a generative model based on a semi-supervised variational autoencoder that can generate new AMPs and perform analogue discovery. Novel features of our approach include non-iterative training, parameter-regulated model creativity, and improvement of existing AMPs. We introduced multiple refinements to latent space modelling that allow us to sample novel AMPs despite data scarcity. The peptides generated by HydrAMP are similar to known AMPs in terms of physicochemical properties. We have successfully obtained and experimentally verified a new, more active analogue of Pexiganan, showing that HydrAMP is able to find potent analogues of existing peptides. The learnt representation enables fast and efficient discovery of peptides with desired biological activity.

Light Attention Predicts Protein Location from the Language of Life
COSI: Special Session 01
  • Hannes Stärk, Department of Informatics, Technical University of Munich, Germany
  • Christian Dallago, Department of Informatics, Technical University of Munich, Germany
  • Michael Heinzinger, Department of Informatics, Technical University of Munich, Germany
  • Burkhard Rost, Department of Informatics, Technical University of Munich, Germany

Short Abstract: Although knowing where a protein functions in a cell is important to characterize biological processes, this information remains unavailable for most known proteins. Machine learning narrows the gap through predictions from expertly chosen input features leveraging evolutionary information that is resource expensive to generate. We showcase using embeddings from protein language models for competitive localization predictions not relying on evolutionary information. Our lightweight deep neural network architecture uses a softmax weighted aggregation mechanism with linear complexity in sequence length referred to as light attention (LA). The method significantly outperformed the state-of-the-art for ten localization classes by about eight percentage points (Q10). The novel models are available as a web-service and as a stand-alone application at embed.protein.properties.
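
The softmax-weighted aggregation mentioned in the abstract above can be illustrated with a minimal sketch. This is not the authors' architecture: it assumes simple per-position linear maps for the scores and values, and the names `project`, `softmax_weighted_pool`, `score_weights` and `value_weights` are hypothetical. It only shows how a variable-length sequence of embeddings can be pooled into one fixed-size vector in a single pass over the sequence, which is why the cost grows linearly with sequence length.

```python
import math
from typing import List

Matrix = List[List[float]]  # rows = sequence positions, columns = features


def softmax(values: List[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def project(embeddings: Matrix, weights: Matrix) -> Matrix:
    """Apply the same linear map (d_in x d_out) to every sequence position."""
    columns = list(zip(*weights))
    return [
        [sum(x * w for x, w in zip(row, col)) for col in columns]
        for row in embeddings
    ]


def softmax_weighted_pool(
    embeddings: Matrix, score_weights: Matrix, value_weights: Matrix
) -> List[float]:
    """Pool an L x d_in sequence into one d_out vector.

    For each output channel, scores are normalised with a softmax over the
    sequence positions and used to weight the values at those positions.
    Each position is visited a constant number of times, so the cost is
    linear in the sequence length L.
    """
    scores = project(embeddings, score_weights)  # L x d_out
    values = project(embeddings, value_weights)  # L x d_out
    pooled = []
    for channel in range(len(scores[0])):
        attention = softmax([row[channel] for row in scores])
        pooled.append(sum(a * row[channel] for a, row in zip(attention, values)))
    return pooled


# Toy input: three sequence positions, two features per position.
toy_embeddings = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
zeros = [[0.0, 0.0], [0.0, 0.0]]

# With all-zero score weights the attention is uniform, so the result is the
# per-feature mean of the values.
uniform_pool = softmax_weighted_pool(toy_embeddings, zeros, identity)
```

In this sketch the pooled vector has the same size regardless of how many positions the input has, so it could feed a fixed-size classifier such as one predicting localization classes.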

MULocDeep: A deep-learning framework for protein subcellular and suborganellar localization prediction with residue-level interpretation
COSI: Special Session 01
  • Dong Xu, University of Missouri-Columbia, United States
  • Duolin Wang, University of Missouri-Columbia, United States
  • Yuexu Jiang
  • Yifu Yao
  • Holger Eubel
  • Patrick Kunzler
  • Ian Max Moller

Multimodal data visualization and denoising with integrated diffusion
COSI: Special Session 01
  • Manik Kuchroo
  • Abhinav Godavarthi
  • Guy Wolf

ProteinBERT: A universal deep-learning model of protein sequence and function
COSI: Special Session 01
  • Nadav Brandes, The Hebrew University of Jerusalem, Israel
  • Dan Ofer, The Hebrew University of Jerusalem, Israel
  • Michal Linial, The Hebrew University of Jerusalem, Israel
  • Yam Peleg
  • Nadav Rappoport

Random Walk-based Matrix Factorization of a Multilayer Network for Protein Function Prediction
COSI: Special Session 01
  • Surabhi Jagtap, CentraleSupelec; IFP Energies nouvelles, France
  • IFP Energies nouvelles IFP Energies nouvelles, IFP Energies nouvelles, France
  • Frederique Bidard, IFP Energies nouvelles, France
  • Laurent Duval, IFP Energies nouvelles, France
  • Fragkiskos D. Malliaros, CentraleSupelec, France

Short Abstract: Cellular systems of organisms are composed of multiple interacting entities that control cellular processes at multiple levels through tightly regulated molecular networks. In recent years, the advent of high-throughput experimental methods has led to a growing number of large-scale molecular and functional interaction networks, such as gene co-expression, protein–protein interaction (PPI), genetic interaction, and metabolic networks. These networks are rich sources of information that could be used to infer the functional annotations of genes or proteins. Extracting relevant biological information from their topologies is essential for understanding the functioning of the cell and its building blocks (proteins). Therefore, it is necessary to obtain an informative representation of the proteins and their proximity that is not fully captured by features extracted directly from single input networks. Here, we propose BraneMF, a random walk-based matrix factorization of a multi-layer network for protein function prediction.

Representation Reprogramming via Dictionary Learning (R2DL) for adversarially reprogramming pretrained sequence and token language models for molecular learning tasks
COSI: Special Session 01
  • Ria Vinod
  • Pin-Yu Chen
  • Payel Das

Toy dataset for the molecular recognition problem
COSI: Special Session 01
  • Georgy Derevyanko
  • Siddarth Bhadra-Lobo
  • Guillaume Lamoureux

Short Abstract: Predicting the physical interaction of proteins is a cornerstone problem in computational biology. New classes of learning-based algorithms are actively being developed, and are typically trained end-to-end on protein structures extracted from the Protein Data Bank. These training datasets tend to be large and difficult to use for prototyping and, unlike image or natural language datasets, they are not easily interpretable by non-experts. In this paper we propose Dock2D-IP and Dock2D-FI, two toy datasets that can be used to select algorithms predicting protein-protein interactions, or any other type of molecular interaction. Using two-dimensional shapes as input, each example from Dock2D-FI describes the fact of interaction between two shapes and each example from Dock2D-IP describes the interaction pose of two shapes known to interact. With the hope that it will stimulate further research, we also propose a number of baselines that represent different approaches to the problem.

Translation Initiation Site Prediction Using Deep Learning and Synthetic Datasets
COSI: Special Session 01
  • Wesley De Neve
  • Yunseol Park
  • Espoir Kabanga
  • Jasper Zuallaert
  • Hyunjin Shim
  • Arnout Van Messem

Short Abstract: Building a prediction model for translation initiation sites (TISs) and determining their important features may aid in uncovering new translation mechanisms and give emphasis to already existing ones. However, interpretation is difficult, as many machine learning models are black box in nature. Therefore, to better understand the features relevant to the predictions made, we investigate the use of synthetic data in the context of TIS prediction for A. thaliana and, through transfer learning, for H. sapiens. From our experiments, we found that the model trained with synthetic data (SBBM) and the model trained with real data (RBBM) learn from similar features. Furthermore, the model trained with both real and synthetic data (CBBM) achieved effectiveness similar to that of RBBM. We also found that CBBM could be used to reduce overfitting when training with small datasets. In addition, we observed that consensus sequence and nucleotide frequency are the most positively influencing features, while codon usage was found to be a negatively influencing feature. Finally, the models seemed to learn leaky scanning, as shown by the less influential nature of upstream ATG. Through this case study on TIS prediction, we were able to gain insight into (1) the potential of leveraging synthetic data for the interpretation of black-box prediction models and (2) the prediction potential of models trained using a combination of synthetic and real data.

International Society for Computational Biology
525-K East Market Street, RM 330
Leesburg, VA, USA 20176